yannic kilcher
1:09:00
[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
0:00:32
Yannic Kilcher on PhD's for ML #shorts
0:53:02
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)
0:27:48
Were RNNs All We Needed? (Paper Explained)
0:01:01
Yannic Kilcher on superintelligence #machinelearning
0:28:23
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained)
0:10:49
My GitHub (Trash code I wrote during PhD)
0:29:47
Grokking: Generalization beyond Overfitting on small algorithmic datasets (Paper Explained)
0:27:07
Attention Is All You Need
1:11:58
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained)
0:56:16
Flow Matching for Generative Modeling (Paper Explained)
0:37:06
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
0:59:38
JEPA - A Path Towards Autonomous Machine Intelligence (Paper Explained)
0:45:44
What is Q-Learning (back to basics)
0:48:53
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained)
0:36:15
Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained)
0:40:40
Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Paper Explained)
0:29:56
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Paper Explained)
0:15:12
No, Anthropic's Claude 3 is NOT sentient
0:57:00
xLSTM: Extended Long Short-Term Memory
0:19:20
GPT-4chan: This is the worst AI ever
1:05:16
Hopfield Networks is All You Need (Paper Explained)
0:48:07
OpenAI CLIP: Connecting Text and Images (Paper Explained)